cDNA sequence:

1 ATGAGCTTAA AGTGTCTCTG TCTTGCTTGC AGGCTACAAC CCATTTGCCC
51 CATTGAAGGT CGACTGGGTG GAGCCCGCAC TCAGGCTGAA TTCCCACTTC
101 GCGCCCTGCA GTTTAAGCGT GGCCTGCTGC ACGAGTTCCG GAAGGGCAAC
151 GCTTCCAAGG AGCAGGTTCG CCTCCATGAC CTGGTCCAGC AGCTCCCCAA
201 GGCCATTATC ATTGGGGTGA GGAAAGGAGG CACAAGGGCC CTGCTTGAAA
251 TGCTGAACCT ACATCCGGCA GTAGTCAAAG CCTCTCAAGA AATCCACTTT
301 TTTGATAATG ATGAGAATTA TGGTAAGGGC ATTGAGTGGT ATAGGAAAAA
351 GATGCCTTTT TCCTACCCTC AGCAAATCAC AATTGAAAAG AGCCCAGCAT
401 ATTTTATCAC AGAGGAGGTT CCAGAAAGGA TTTACAAAAT GAACTCATCC
451 ATCAAGTTGT TGATCATTGT CAGGGAGCCA ACCACAAGAG CTATTTCTGA
501 TTATACTCAG GTGCTAGAGG GGAAGGAGAG GAAGAACAAA ACTTATTACA
551 AGTTTGAGAA GCTGGCCATA GACCCTAATA CATGCGAAGT GAACACAAAA
601 TACAAAGCAG TAAGAACCAG CATCTACACC AAACATCTGG AAAGGTGGTT
651 GAAATACTTT CCAATTGAGC AATTTCATGT CGTCGATGGA GATCGCCTCA
701 TCACGGAACC TCTGCCAGAA CTTCAGCTCG TGGAGAAGTT CCTAAATCTG
751 CCTCCAAGGA TAAGTCAATA CAATTTATAC TTCAATGCTA CCAGAGGGTT
801 TTACTGCTTG CGGTTTAATA TTATCTTTAA TAAGTGCCTG GCGGGCAGCA
851 AGGGGCGCAT TCATCCAGAG GTGGACCCCT CTGTCATTAC TAAATTGCGC
901 AAATTCTTTC ATCCTTTTAA TCAAAAATTT TACCAGATCA CTGGGAGGAC
951 ATTGAACTGG CCCTAAGGGC (SEQ ID NO: 1)


FEATURES: 
5'UTR: 

Start Codon: 1 
Stop Codon: 964 
3'UTR: 967
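
The start and stop coordinates above can be sanity-checked against the listed bases. The following is a minimal sketch, not part of the original listing: `translate` is a hypothetical helper built on the standard genetic code, and the two excerpts are copied from SEQ ID NO: 1 (positions 1-30 and the final row beginning at position 951).

```python
# Standard genetic code, codons enumerated in TCAG order (assumption: standard table applies)
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(nt):
    """Translate nt in frame 1; return (protein, 1-based offset of the stop codon or None)."""
    protein = []
    for i in range(0, len(nt) - 2, 3):
        aa = CODON_TABLE[nt[i:i + 3]]
        if aa == "*":
            return "".join(protein), i + 1
        protein.append(aa)
    return "".join(protein), None


head = "ATGAGCTTAAAGTGTCTCTGTCTTGCTTGC"  # cDNA positions 1-30
tail = "ATTGAACTGGCCCTAAGGGC"            # cDNA positions 951-970

n_term, _ = translate(head)
# 951 is a multiple of 3, so the frame that starts at position 1 resumes at position 952
c_term, stop_offset = translate(tail[1:])
stop_position = 952 + stop_offset - 1
print(n_term, c_term, stop_position)
```

Under these assumptions the first ten codons and the position of the first in-frame stop in the last row can be compared directly with the Start Codon and Stop Codon entries.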

HOMOLOGOUS PROTEIN:
Top 10 BLAST Hits:
                                                                     Score  E-value
gi|4826764|ref|NP_005105.1| heparan sulfate (glucosamine) 3-O-s..     303   1e-81
gi|6754246|ref|NP_034604.1| heparan sulfate (glucosamine) 3-O-s..     297   1e-79
gi|9957244|gb|AAG09283.1| (AF177430) 3-O-sulfotransferase [Ratt..     293   3e-78
gi|7293568|gb|AAF48941.1| (AE003511) CG7890 gene product [Droso..     251   7e-66
gi|5174463|ref|NP_006034.1| heparan sulfate (glucosamine) 3-O-s..     248   6e-65
gi|5174467|ref|NP_006032.1| heparan sulfate (glucosamine) 3-O-s..     247   1e-64
gi|5174465|ref|NP_006033.1| heparan sulfate (glucosamine) 3-O-s..     247   1e-64
gi|9055264|ref|NP_061275.1| D-glycosaminyl 3-O-sulfotransferase..     240   2e-62
gi|4835727|gb|AAD30210.1|AF105378_1 (AF105378) heparan sulfate ..     235   6e-61
gi|7503118|pir||T33493 hypothetical protein F40H3.5 - Caenorhab..     187   2e-46

EST:
gi|6990086 /dataset=dbest /taxon=960..    672   0.0
gi|5362532 /dataset=dbest /taxon=9606     660   0.0
gi|4682357 /dataset=dbest /taxon=9606     571   e-161

EXPRESSION INFORMATION FOR MODULATORY USE:

gi|6990086 / lung
gi|5362532 / lung carcinoid
gi|4682357 / lung carcinoid

Tissue expression 
Human Brain 
Human bone marrow 
Human colon 
Human fetal brain 
Human fetal heart 
Human fetal liver 
Human fetal lung 
Human pancreas 
Human placenta 
1 MSLKCLCLAC RLQPICPIEG RLGGARTQAE FPLRALQFKR GLLHEFRKGN
51 ASKEQVRLHD LVQQLPKAII IGVRKGGTRA LLEMLNLHPA VVKASQEIHF
101 FDNDENYGKG IEWYRKKMPF SYPQQITIEK SPAYFITEEV PERIYKMNSS
151 IKLLIIVREP TTRAISDYTQ VLEGKERKNK TYYKFEKLAI DPNTCEVNTK
201 YKAVRTSIYT KHLERWLKYF PIEQFHVVDG DRLITEPLPE LQLVEKFLNL
251 PPRISQYNLY FNATRGFYCL RFNIIFNKCL AGSKGRIHPE VDPSVITKLR
301 KFFHPFNQKF YQITGRTLNW P (SEQ ID NO: 2)



FEATURES: 

Functional domains and key regions: 
Prosite results: 

[1] PDOC00001 PS00001 ASN_GLYCOSYLATION
N-glycosylation site 

Number of matches: 4 

1 50-53 NASK 

2 148-151 NSSI 

3 179-182 NKTY 

4 262-265 NATR 



[2] PDOC00005 PS00005 PKC_PHOSPHO_SITE 
Protein kinase C phosphorylation site 

Number of matches: 4 

1 2-4 SLK 

2 150-152 SIK 

3 161-163 TTR 

4 314-316 TGR 



[3] PDOC00006 PS00006 CK2_PHOSPHO_SITE
Casein kinase II phosphorylation site 

27-30 TQAE 



[4] PDOC00008 PS00008 MYRISTYL 
N-myristoylation site 

Number of matches: 3 

1 23-28 GGARTQ 

2 72-77 GVRKGG 

3 76-81 GGTRAL 
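
The Prosite matches listed above are pattern hits, so they can be reproduced with a simple scan. The sketch below is illustrative only: it assumes the published PROSITE patterns PS00001 (`N-{P}-[ST]-{P}`) and PS00005 (`[ST]-x-[RK]`) rewritten as regular expressions, and it scans only residues 141-190 of SEQ ID NO: 2 as copied from the listing. The helper name `scan` is hypothetical.

```python
import re

# Residues 141-190 of SEQ ID NO: 2, copied from the listing (an excerpt, not the full protein)
FRAGMENT = "PERIYKMNSSIKLLIIVREPTTRAISDYTQVLEGKERKNKTYYKFEKLAI"
FRAGMENT_START = 141

# PROSITE patterns rewritten as regular expressions
PATTERNS = {
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",  # PS00001: N-{P}-[ST]-{P}
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",       # PS00005: [ST]-x-[RK]
}


def scan(seq, start, pattern):
    """Return (first, last, match) tuples in protein coordinates, overlaps included."""
    hits = []
    for m in re.finditer(f"(?=({pattern}))", seq):
        first = start + m.start()
        hits.append((first, first + len(m.group(1)) - 1, m.group(1)))
    return hits


results = {name: scan(FRAGMENT, FRAGMENT_START, pat) for name, pat in PATTERNS.items()}
for name, hits in results.items():
    print(name, hits)
```

The lookahead form `(?=(...))` is used so that overlapping candidate sites are all reported, which matters for short motifs such as the phosphorylation patterns.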
BLAST Alignment to Top Hit:

Score = 774 (277.5 bits), Expect = 6.0e-76, P = 6.0e-76
Identities = 137/258 (53%), Positives = 189/258 (73%), Frame = +2

>TRM|O14792|O14792 /def="HEPARAN SULFATE 3-O-SULFOTRANSFERASE-1 PRECURSOR.
/org=Homo_sapiens /date=01-MAY-2000 /mol_type=PRT /len=307
/gene_name=3OST /mol_weight=35773 /seq_status=PRELIMINARY
Length = 307

Query: 191 QQLPKAIIIGVRKGGTRALLEMLNLHPAVVKASQEIHFFDNDENYGKGIEWYRKKMPFSY
           QQLP+ IIIGVRKGGTRALLEML+LHP V A E+HFFD +E+Y G+ WY ++MPFS+
Sbjct:

Query: 371
           P QfT+EK+PAYF + +VPER+Y MN SI+LL+I+R+P+ R +SDYTQV +K+K Y
Sbjct: 112

Query: 551
           E+ + +N YKA+ S+Y H++ WL++FP+ H+VDGDRLI +P PE+Q
Sbjct: 172

Query: 731
           VE+FL L P+I+ N YFN T+GFYCLR + ++CL SKGR HP+VDP ++ KL ++
Sbjct: 230

Query: 911
           FH N+KF+++ GRT +W
Sbjct: 289

Hmmer search results (Pfam):

Scores for sequence family classification (score includes all domains):
Model    Description                  Score  E-value  N
PF02004  Protein of unknown function  4.3    6.5

Parsed for domains:

Model    Domain  seq-f  seq-t     hmm-f  hmm-t     score  E-value
PF02004  1/1     133    147  ..   266    280  .]   4.3    6.5
1 ATTAGCTTCC AATCATTTAC CTTTTACTTA GTAATTGATC TAATGATCAC
51 TAATGCATTA TTATTTAGTT GATGATTCTT TTCATTTTTT TAACTCTGTC
101 TCTAGTCTCT AAGGGGATAG CTTTTATTTG GAATTGAATT GTTTGGTGGG 
151 CTTTCTAAAA GCCTCTCACT TCAGACTTTG AGATTATGTC TGAAGGTAAC 
201 AGGCTTATTT AGGCCCACTC TCCAGTAACT GAAGACCCTG CTTTCTGGGA 
251 GGGAGACAGA GGTTACTTCT ACCATCCCTT CCAATCCTAA ACCTGTATGA 
301 TTTTTCAGTC TGGGACCCAT ACTCAGAATC CATGCTTTCA GAAGTGGGAA
351 AGAATATGAT ATTTTCTCAA ATTTTCACAT TCTATCTTGA GTTAGGGAGT 
401 CCAAAAAGCG ACTATTCTGC AGGATGTGAT CTCCCAGGGT AGAAGATAGA 
451 AAGAGGAAGG AAGTAAAGAA GGAAAATGAC CCTTTCTACA AGTGGGGAAA 
501 TTCCATTTGA CCTCAAACAA AGCAGAGACT GTCTATATCA GCCACTCTCA 
551 GCCAGGGTAC TATGAAAGAA TTAAATCCTA CAAAAAAGAA TTTGAGTGAC 
601 TGTTTCCTCA ATTCTTCCAA GGATGGTACT AGCATCATTC TAGGTGCTTA 
651 GGACAGAAAT CCATCAATGG ATGCCTTATG GAATTAGAGC TTAATTCTCA 
701 ACCAGAACCC AAGAAGAACT GAAAGATGAA CTTGTATTAT TCCAATCAGT 
751 GTCACAATTA AAAGCATCTT TGCCTATGTA TCTATTGATA ATTTTACATC 
801 CTCCATTTAA AGCCCTAGTA CATTAATCTC ATTAACAAAT TTATAAAAAC 
851 AAAATTCATG TTTCTCTAAA CTATTAACCG GGTTAAATCC TGTTTTTTAA
901 AAGCTGTCTA GGCCAGGCAC AGTAGCTCAC GCCTGTAATC CCAGCACTTT 
951 GGGAGGCTGA GGCAGGCGAA TCACGAGATC AGGAGTTCAA GACCAGCCAG 
1001 GCCAACATGG TGAAACCTTG TCTCTACTAA AAATACAAAA ATTAGCTGGG 
1051 TATGGTGGCG CAGGCCTGTA ATCCCAGCTA CTCGGGAGGC TGAGGCAGGA 
1101 GAATCTCTTG AACCCAGGAG ACAGAGATTG CAGTGAGCCA AGATCGTGCC 
1151 ACTGCACTGC AGCCTAGGCA ACAGACCAAG ACTCCGTCTC AAAAAAAAAA 
1201 GAAAAAAAAG TTGTCTATAT TTTCACACTT TCCACAATGA GCATGAGTTG
1251 TTTTAAAAAT CATAAAAAAG AAACATCGTG AAAAGTAGTA TACATTGATA 
1301 TTTTTCCTTA AGCATTATGA TAGATAGCTG TTTAAACAGA ACAAAGACCA
1351 AGACCATGCT CCTCAATTCT GCAGAACAGG CTGAGTGTAT TAGTCCGTTT 
1401 TCACAGTGCT ATAAAGACAT ACCTGAGACT GAGTAATTTA TAAAGAAAAA 
1451 AGGTTTAATT GACACACAGT TCTGCATGGC TGGGGAAGCC TCAGAAAACT 
1501 TACAATCATG GCAGAAGGCA AAGAAGAAGC AAGGCACGTC TTACTTGGTG 
1551 GCAGGAGAGA GAGGGAGCTT GCAGGGGGCG GTGCCACACA GTTTTAAACC 
1601 ATCAAATCTC ATGAGAACTC ACTATCATGA AAACAAGGGG TAAATACACC 
1651 CCCATAATCC AGTCACCTCC CACCAAGCCC CTCCTCCGAC ATGTGGGGAT 
1701 TACAATTCGG GATGAGATTT GGGTGGGGGC ACAGAGCCAA ACCATATCAC 
1751 TGGGCATGAC CTTGAGGTTG TTTCTCATCT CAGAAAACAA GAAAGATGCA 
1801 ATACAGTCTC TTGGGAAAAG CAAGCAACAG CCTCATTGCC ACAGAGGGGG 
1851 AGACACAGAT TCCAAATTAT TAGAATAACT GGAAGCTTTC AAGTGTAAGA 
1901 ATTGGTTTAA CAGCLI I I I I GACTGATATT ATTTAATTTT ACCAAGAAGG 
1951 CTAAAATGCC CTCACAGATC AACTTAGGGG AATTATAATG AACTTCAGTT 
2001 CAATTCAGAC TATACCTAAA AGGAAACTCA ATTTGCTAAC CATATATGTT 
2051 AGCCATGACA AATTAAACAG TCACCATCGT CTACTATCAT TGTGACTGTT 
2101 ACCACATCTT TCTCCCTGAG AAAAGCAGAG ATGGTTGTTC ACTATTCAGG 
2151 ATAATACTGA AGTGGAAATC CTCCTGTCTG GCTATATCCA TTGCACTCCT 
2201 TCCTTAATGA GATTGAGTTC CTGATTTTAA TGGGCTTGGC AATGAGGGCT 
2251 TGAGGTTTCT GGCCCTGTCA AGGTCTTGTT GATGCCTGGT CCCAGGTGTG 
2301 GTAGGTGATA TACAGCACTT GCTGATGGCA ATTGGGTTTG ATTCTATATT 
2351 CAGCAAAGTG GATATATAAT CCTGACCTCT TTAGATAGAA AGAGAAAGAG 
2401 AGGCAGAAGA AATATAGTAT TCTTCTGGCT ATCCTCAAGG CCCAGGGCAG 
2451 AGAGTCTCAG AATGAAAATC TCAGCAAGTT CCAAGATTGG AATTTTGCAG 
2501 GTTGATGATG CAAACAGCCC GGGGCAGAAA CTGGGACCTC CTTTCAGATT
2551 ATATCTCAAA GATTTTCAAG AGCCATCTGA GTGCTGCCGA GCTGCAAGAA 
2601 AATAATACCA CACAAAATGT GAAACACATG GCCTCCCTGC TACCCTTCCA 
2651 CCTCCCAGCT GAAGATTATA ATCTCCTGCC TTTCACTTTT TCTTAATGAT 
2701 TTTAACTGGT GAGCTGTTAA AAAGCTATTA GTATGGCTGG TGCCACTTGT
2751 CTATCCTGTA CTGCAAACAG AAGTGCACGC CGTAGTCAAT TAAGTGCTTG 
2801 GAGAATAAAA AATTTTAAGG AGCACTAATA AAAAAATTCA TCAATTATGT 
2851 GTGCTCCATT TAATACATGG TTGCTTAAAA TAAAATTTCC CAAACATATG 
2901 TTCATTATGG ATTGCAGCAG GCTGGGAACC AGTGGCTTTA TTTATGCATT 
2951 TAAAGTCTTG GTCTGACTGG GGAACCAGAA AAATGAAAAG TTAGTTGCAA 
3001 TGAGCTTAAA GTGTCTCTGT CTTGCTTGCA GGCTACAACC CATTTGCCCC 
3051 ATTGAAGGTC GACTGGGTGG AGCCCGCACT CAGGCTGAAT TCCCACTTCG 
3101 CGCCCTGCAG TTTAAGCGTG GCCTGCTGCA CGAGTTCCGG AAGGGCAACG 
3151 CTTCCAAGGA GCAGGTTCGC CTCCATGACC TGGTCCAGCA GCTCCCCAAG 
3201 GCCATTATCA TTGGGGTGAG GAAAGGAGGC ACAAGGGCCC TGCTTGAAAT 
3251 GCTGAACCTA CATCCGGCAG TAGTCAAAGC CTCTCAAGAA ATCCACTTTT 
3301 TTGATAATGA TGAGAATTAT GGTAAGGGCA TTGAGTGGTA TAGGAAAAAG 
3351 ATGCCTTTTT CCTACCCTCA GCAAATCACA ATTGAAAAGA GCCCAGCATA
3401 TTTTATCACA GAGGAGGTTC CAGAAAGGAT TTACAAAATG AACTCATCCA 
3451 TCAAGTTGTT GATCATTGTC AGGGAGCCAA CCACAAGAGC TATTTCTGAT 
3501 TATACTCAGG TGCTAGAGGG GAAGGAGAGG AAGAACAAAA CTTATTACAA 
3551 GTTTGAGAAG CTGGCCATAG ACCCTAATAC ATGCGAAGTG AACACAAAAT 
3601 ACAAAGCAGT AAGAACCAGC ATCTACACCA AACATCTGGA AAGGTGGTTG 
3651 AAATACTTTC CAATTGAGCA ATTTCATGTC GTCGATGGAG ATCGCCTCAT 
3701 CACGGAACCT CTGCCAGAAC TTCAGCTCGT GGAGAAGTTC CTAAATCTGC 
3751 CTCCAAGGAT AAGTCAATAC AATTTATACT TCAATGCTAC CAGAGGGTTT 
3801 TACTGCTTGC GGTTTAATAT TATCTTTAAT AAGTGCCTGG CGGGCAGCAA 
3851 GGGGCGCATT CATCCAGAGG TGGACCCCTC TGTCATTACT AAATTGCGCA 
3901 AATTCTTTCA TCCTTTTAAT CAAAAATTTT ACCAGATCAC TGGGAGGACA 
3951 TTGAACTGGC CCTAAAATAA TATGTCATAC AACACTATGT GTTGTGCCTG 
4001 GAGACACACA ATGTCTCCTG TAGATTAAAA TATGCACTTT TCCTAGGCAG 
4051 AGCTATCCAA GTCATTTTTC CATGTATATT TGTACATACG CAGTGTGTGA
4101 CCAAATATAA GATCAGTTCT TTTTCTACTG AAAATTTACG AAAAAAAAAA 
4151 AATTGCTGTC TGCATAGTCG CATCTTTTAA GCTATTTACA AAAGAGAAGA 
4201 GGTGGTGGTA TTGGGGGAAA GTGACTTCAG CTATTCTCAA AGAGTTAGTC 
4251 TTCCTTTGAT TCAGAATTTG TCACCCGCCA TTTTCATAGA TTTAAGCCAA 
4301 AAGATAAATG TGTGAAAATG TACCAATGGC TGCGAAGCTT CAGGAAGTAG 
4351 AGGATCCAGT GATGCATTTT TTTTTTCCTA AGGGAAAGCT GGGTCTTTAA
4401 TTCAGATGCT GAATTGGTGC CATGAAAACA GAAAATGCTA TTTTCTTATT
4451 ATTTAAAAGA ACGTCTTATC TCATAAAATT GACATTGTTC CAAAGTTCTT 
4501 GTGGTGATTT TGCACTATTG TTTTCTCGTA TGGACCATGG TGTCACTTGT 
4551 AGCATGTCAA TCACACATTG GAAAGTCAAG TCCTTTTACT TCCATGTTGT
4601 ATGTCAACAG AGAGAAATGT CATGTACATA ATGTATATTG TTGTAAATAC 
4651 TGGTTTCACA CTAAGTAATT CTATTTTGTA AACTGAATAT GGCTATTTAA 
4701 TTTATTGTGA AAATTAAATT TATTGTGGTA TTTAAAAATG GAATGGATTA 
4751 AAATTACTCT ATGTGCAATT TTTTTTTTTT TTACTCATTT TGTTTTACGT
4801 GCCCCCTGCT GGCTTCCAAA ATGGAAGCTG TTTACGTGCA TATGAGAGCA 
4851 CTTGGAAAGA TGTGCTTCCC TGCTGGATTT CTGTACCCCA GTGAAAATGT 
4901 ATTTATGAAG TGAGGTTGAG TATATTAAAA AAGAAAAACC TCAACCATCT 
4951 GGAAATCAAG TATAATAGCC ACCTCAAAGA ACCCTAGTGC TGCTCTGCTA 
5001 CAACTTTGTA ACAATTAATT TACTCGCAGT TGCTGCTGCT CAGG (SEQ ID NO: 3) 



FEATURES: 
Start: 3000 
Exon: 3000-3962 
Stop: 3963 
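
The exon and stop coordinates above can be cross-checked with simple arithmetic. This is a hedged sketch rather than part of the listing: the variable names are illustrative, and `ROW_3951` is the row of SEQ ID NO: 3 that begins at position 3951, copied from the genomic sequence.

```python
# Coordinates as given in the FEATURES block for SEQ ID NO: 3
EXON_START, EXON_END, STOP = 3000, 3962, 3963

# A single coding exon should span a whole number of codons
exon_length = EXON_END - EXON_START + 1
codons, remainder = divmod(exon_length, 3)

# Genomic row beginning at position 3951, copied from the listing
ROW_3951 = "TTGAACTGGCCCTAAAATAATATGTCATACAACACTATGTGTTGTGCCTG"
stop_codon = ROW_3951[STOP - 3951:STOP - 3951 + 3]

print(exon_length, codons, remainder, stop_codon)
```

The codon count can then be compared with the length of the protein given as SEQ ID NO: 2.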
SNPs:

DNA                                        Protein
Position  Major  Minor  Domain             Position  Major  Minor
155       C      G      Beyond ORF(5')
370       A      T      Beyond ORF(5')
2775      G      A      Beyond ORF(5')
4240      A      G      Beyond ORF(3')

Context:

DNA
Position

155     ATTAGCTTCCAATCATTTACCTTTTACTTAGTAATTGATCTAATGATCACTAATGCATTA
        TTATTTAGTTGATGATTCTTTTCATTTTTTTAACTCTGTCTCTAGTCTCTAAGGGGATAG
        CTTTTATTTGGAATTGAATTGTTTGGTGGGCTTT
        [C,G]
        TAAAAGCCTCTCACTTCAGACTTTGAGATTATGTCTGAAGGTAACAGGCTTATTTAGGCC
        CACTCTCCAGTAACTGAAGACCCTGCTTTCTGGGAGGGAGACAGAGGTTACTTCTACCAT
        CCCTTCCAATCCTAAACCTGTATGATTTTTCAGTCTGGGACCCATACTCAGAATCCATGC
        TTTCAGAAGTGGGAAAGAATATGATATTTTCTCAAATTTTCACATTCTATCTTGAGTTAG
        GGAGTCCAAAAAGCGACTATTCTGCAGGATGTGATCTCCCAGGGTAGAAGATAGAAAGAG

370     TGATGATTCTTTTCATTTTTTTAACTCTGTCTCTAGTCTCTAAGGGGATAGCTTTTATTT
        GGAATTGAATTGTTTGGTGGGCTTTCTAAAAGCCTCTCACTTCAGACTTTGAGATTATGT
        CTGAAGGTAACAGGCTTATTTAGGCCCACTCTCCAGTAACTGAAGACCCTGCTTTCTGGG
        AGGGAGACAGAGGTTACTTCTACCATCCCTTCCAATCCTAAACCTGTATGATTTTTCAGT
        CTGGGACCCATACTCAGAATCCATGCTTTCAGAAGTGGGAAAGAATATGATATTTTCTCA
        [A,T]
        ATTTTCACATTCTATCTTGAGTTAGGGAGTCCAAAAAGCGACTATTCTGCAGGATGTGAT
        CTCCCAGGGTAGAAGATAGAAAGAGGAAGGAAGTAAAGAAGGAAAATGACCCTTTCTACA
        AGTGGGGAAATTCCATTTGACCTCAAACAAAGCAGAGACTGTCTATATCAGCCACTCTCA
        GCCAGGGTACTATGAAAGAATTAAATCCTACAAAAAAGAATTTGAGTGACTGTTTCCTCA
        ATTCTTCCAAGGATGGTACTAGCATCATTCTAGGTGCTTAGGACAGAAATCCATCAATGG

2775    CAAGTTCCAAGATTGGAATTTTGCAGGTTGATGATGCAAACAGCCCGGGGCAGAAACTGG
        GACCTCCTTTCAGATTATATCTCAAAGATTTTCAAGAGCCATCTGAGTGCTGCCGAGCTG
        CAAGAAAATAATACCACACAAAATGTGAAACACATGGCCTCCCTGCTACCCTTCCACCTC
        CCAGCTGAAGATTATAATCTCCTGCCTTTCACTTTTTCTTAATGATTTTAACTGGTGAGC
        TGTTAAAAAGCTATTAGTATGGCTGGTGCCACTTGTCTATCCTGTACTGCAAACAGAAGT
        [G,A]
        CACGCCGTAGTCAATTAAGTGCTTGGAGAATAAAAAATTTTAAGGAGCACTAATAAAAAA
        ATTCATCAATTATGTGTGCTCCATTTAATACATGGTTGCTTAAAATAAAATTTCCCAAAC
        ATATGTTCATTATGGATTGCAGCAGGCTGGGAACCAGTGGCTTTATTTATGCATTTAAAG
        TCTTGGTCTGACTGGGGAACCAGAAAAATGAAAAGTTAGTTGCAATGAGCTTAAAGTGTC
        TCTGTCTTGCTTGCAGGCTACAACCCATTTGCCCCATTGAAGGTCGACTGGGTGGAGCCC

4240    CTGGGAGGACATTGAACTGGCCCTAAAATAATATGTCATACAACACTATGTGTTGTGCCT
        GGAGACACACAATGTCTCCTGTAGATTAAAATATGCACTTTTCCTAGGCAGAGCTATCCA
        AGTCATTTTTCCATGTATATTTGTACATACGCAGTGTGTGACCAAATATAAGATCAGTTC
        TTTTTCTACTGAAAATTTACGAAAAAAAAAAAATTGCTGTCTGCATAGTCGCATCTTTTA
        AGCTATTTACAAAAGAGAAGAGGTGGTGGTATTGGGGGAAAGTGACTTCAGCTATTCTCA
        [A,G]
        AGAGTTAGTCTTCCTTTGATTCAGAATTTGTCACCCGCCATTTTCATAGATTTAAGCCAA
        AAGATAAATGTGTGAAAATGTACCAATGGCTGCGAAGCTTCAGGAAGTAGAGGATCCAGT
        GATGCATTTTTTTTTTCCTAAGGGAAAGCTGGGTCTTTAATTCAGATGCTGAATTGGTGC
        CATGAAAACAGAAAATGCTATTTTCTTATTATTTAAAAGAACGTCTTATCTCATAAAATT
        GACATTGTTCCAAAGTTCTTGTGGTGATTTTGCACTATTGTTTTCTCGTATGGACCATGG
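
Each context entry above pairs the flanking genomic sequence with the two alleles in `[major,minor]` form. The sketch below shows one way such an entry could be derived from a stretch of SEQ ID NO: 3. It is illustrative only: `snp_context` is a hypothetical helper, the flank is shortened to 10 bases for readability, and `ROW_4201` is the genomic row that begins at position 4201, copied from the listing.

```python
def snp_context(seq, first_pos, snp_pos, minor, flank=10):
    """Format flank + [major,minor] + flank around a 1-based SNP position.

    seq       -- a stretch of the reference sequence
    first_pos -- 1-based coordinate of seq[0] in the full sequence
    """
    i = snp_pos - first_pos
    if i - flank < 0 or i + flank >= len(seq):
        raise ValueError("sequence excerpt too short for the requested flank")
    major = seq[i]  # the reference base is reported as the major allele
    return f"{seq[i - flank:i]}[{major},{minor}]{seq[i + 1:i + 1 + flank]}"


# Genomic row beginning at position 4201, copied from the listing
ROW_4201 = "GGTGGTGGTATTGGGGGAAAGTGACTTCAGCTATTCTCAAAGAGTTAGTC"

context_4240 = snp_context(ROW_4201, 4201, 4240, "G")
print(context_4240)
```

With a longer excerpt and a larger `flank`, the same slicing would produce multi-line context blocks like the ones shown for each SNP position.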

Chromosome map: 
chromosome 6. 